A Greedy Divide-and-Conquer Approach to Optimizing Large Manufacturing Systems using Reinforcement Learning

Authors

  • Gang Wang
  • Sridhar Mahadevan
Abstract

Manufacturing is a challenging real-world domain for studying hierarchical MDP-based optimization algorithms. We have recently obtained very promising results using a hierarchical, reinforcement-learning-based optimization algorithm for a 12-machine transfer line. Transfer lines model factory processes in automobile and many other product assembly plants. Unlike domains such as elevator scheduling, where the individual elevator "agents" are effectively independent, individual machines in a transfer line are directly affected by each other's behavior. This interaction creates a highly non-stationary environment that defeats "flat" learning algorithms that train each machine while it is part of the line. Our hierarchical optimization algorithm consists of an average-reward Q-learning algorithm for semi-Markov decision processes, called SMART, for learning low-cost policies for operating each individual machine, and a greedy algorithm for combining the base-level policies to obtain an overall policy for running the transfer line. Unlike other approaches to hierarchical optimization, in our system policies at the lowest level are modeled using semi-Markov decision processes (SMDPs). Our results to date show that the hierarchical approach is not only much faster than the "flat" algorithm, but also appears to outperform well-known heuristics for running transfer lines used in many factories today.
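The abstract names an average-reward Q-learning update for SMDPs but does not spell it out. The following is a minimal sketch of what such an update could look like, based on the commonly cited form of SMART-style learning, not on this paper's text. The class name `SmartLearner`, its parameters, and the state and action labels are hypothetical, and the sketch is written in terms of rewards (a cost can be passed as a negative reward).

```python
from collections import defaultdict


class SmartLearner:
    """Illustrative average-reward Q-learner for an SMDP (assumed form, not the paper's code)."""

    def __init__(self, actions, alpha=0.1):
        self.actions = list(actions)
        self.alpha = alpha              # learning rate (assumed constant here)
        self.q = defaultdict(float)     # Q[(state, action)], initialised to 0
        self.total_reward = 0.0         # reward accumulated on greedy steps
        self.total_time = 0.0           # elapsed time accumulated on greedy steps
        self.rho = 0.0                  # estimated average reward per unit time

    def greedy_action(self, state):
        # Pick the action with the highest current Q-value in this state.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, sojourn_time, next_state, was_greedy):
        if sojourn_time <= 0:
            raise ValueError("sojourn_time must be positive in an SMDP transition")
        # Target: reward minus the average reward expected over the time spent
        # in the transition, plus the best value available in the next state.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward - self.rho * sojourn_time + best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        # Refresh the average-reward estimate only on non-exploratory steps.
        if was_greedy:
            self.total_reward += reward
            self.total_time += sojourn_time
            self.rho = self.total_reward / self.total_time
```

The key difference from discounted Q-learning is the `rho * sojourn_time` term: because SMDP transitions take variable amounts of time, each reward is compared against what the current policy would be expected to earn over that same duration.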

Related resources

Optimizing Production Manufacturing Using Reinforcement Learning

Many industrial processes involve making parts with an assembly of machines, where each machine carries out an operation on a part, and the finished product requires a whole series of operations. A well-studied example of such a factory structure is the transfer line, which involves a sequence of machines. Optimizing transfer lines has been a subject of much study in the industrial engineering ...


Clustering for Data Reduction: A Divide and Conquer Approach

We consider the problem of reducing a potentially very large dataset to a subset of representative prototypes. Rather than searching over the entire space of prototypes, we first roughly divide the data into balanced clusters using bisecting k-means and spectral cuts, and then find the prototypes for each cluster by affinity propagation. We apply our algorithm to text data, where we perform an ...


Free Vibration Analysis of Repetitive Structures using Decomposition, and Divide-Conquer Methods

This paper consists of three sections. In the first section, an efficient method is used for decomposition of the canonical matrices associated with repetitive structures. To this end, a cylindrical coordinate system and a special numbering scheme are employed. In the second section, the divide-and-conquer method is used for the eigensolution of these structures, where the matrices are in ...


Evolutionary Divide and Conquer (II) for the TSP

Results presented in recent papers demonstrate that it is possible to produce high quality solutions to TSP instances of up to several hundred cities using simple greedy heuristics when a Genetic Algorithm (GA) is used to perturb the city coordinates. The present paper extends the earlier studies to larger problems and a divide and conquer algorithm in the style of Richard Karp. Using a GA to p...


Divide-and-Conquer Reinforcement Learning

Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization cha...


Publication date: 1998